Strategic Attentive Writer for Learning Macro-Actions
Authors
Abstract
We present a novel deep recurrent neural network architecture that learns to build implicit plans in an end-to-end manner purely by interacting with an environment in a reinforcement learning setting. The network builds an internal plan, which is continuously updated upon observation of the next input from the environment. It can also partition this internal representation into contiguous subsequences by learning for how long the plan can be committed to, i.e. followed without re-planning. Combining these properties, the proposed model, dubbed STRategic Attentive Writer (STRAW), can learn high-level, temporally abstracted macro-actions of varying lengths that are learnt solely from data without any prior information. These macro-actions enable both structured exploration and economical computation. We experimentally demonstrate that STRAW delivers strong improvements on several ATARI games (e.g. Ms. Pacman and Frostbite) by employing temporally extended planning strategies. It is at the same time a general algorithm that can be applied to any sequence data. To that end, we also show that when trained on a text prediction task, STRAW naturally predicts frequent n-grams (instead of macro-actions), demonstrating the generality of the approach.
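The idea of committing to a plan for a variable number of steps can be illustrated with a minimal sketch. This is not the STRAW network: there is no attention, no learning, and the plan is not updated while it is being followed. The names `follow_plans` and `toy_planner` are hypothetical, and the hand-written planning rule stands in for what the model would learn from data.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def follow_plans(
    observations: Iterable[int],
    planner: Callable[[int], Sequence[str]],
) -> Tuple[List[str], List[int]]:
    """Follow each plan to its end before re-planning.

    Returns the executed actions and the time steps at which a new
    plan (macro-action) was requested.
    """
    actions: List[str] = []
    replan_steps: List[int] = []
    plan: List[str] = []  # remaining actions of the current macro-action
    for t, obs in enumerate(observations):
        if not plan:  # commitment has expired, so re-plan from this observation
            plan = list(planner(obs))
            if not plan:
                raise ValueError("planner must return at least one action")
            replan_steps.append(t)
        actions.append(plan.pop(0))  # execute the next committed action
    return actions, replan_steps


def toy_planner(obs: int) -> Sequence[str]:
    # Hypothetical hand-written rule (an assumption for illustration only):
    # commit to three steps on even observations, one step on odd ones.
    return ["right"] * 3 if obs % 2 == 0 else ["up"]


actions, replan_steps = follow_plans(range(6), toy_planner)
print(actions)
print(replan_steps)
```

In this toy run the planner is consulted at only three of the six steps, which is the sense in which committing to macro-actions saves computation.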
Similar resources
From Basic Agent Behavior to Strategic Patterns in a Robotic Soccer Domain
The paper presents an algorithm for multi-agent strategic modeling (MASM). The method applies domain knowledge and transforms sequences of basic multi-agent actions into a set of strategic action descriptions in the form of graph paths, agent actions, roles and corresponding rules. The rules, constructed by machine learning, enrich the graphical strategic patterns, which are presented in the fo...
A Method for Learning Macro-Actions for Virtual Characters Using Programming by Demonstration and Reinforcement Learning
The decision-making by agents in games is commonly based on reinforcement learning. To improve the quality of agents, it is necessary to solve the problems of the time and state space that are required for learning. Such problems can be solved by Macro-Actions, which are defined and executed by a sequence of primitive actions. In this line of research, the learning time is reduced by cutting do...
Roles of Macro-Actions in Accelerating Reinforcement
We analyze the use of built-in policies, or macro-actions, as a form of domain knowledge that can improve the speed and scaling of reinforcement learning algorithms. Such macro-actions are often used in robotics, and macro-operators are also well-known as an aid to state-space search in AI systems. The macro-actions we consider are closed-loop policies with termination conditions. The macro-act...
Plan, Attend, Generate: Character-Level Neural Machine Translation with Planning
We investigate the integration of a planning mechanism into an encoder-decoder architecture with attention. We develop a model that can plan ahead when it computes alignments between the source and target sequences not only for a single time-step, but for the next k timesteps as well by constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or...
Planning with Closed-Loop Macro Actions
Planning and learning at multiple levels of temporal abstraction is a key problem for artificial intelligence. In this paper we summarize an approach to this problem based on the mathematical framework of Markov decision processes and reinforcement learning. Conventional model-based reinforcement learning uses primitive actions that last one time step and that can be modeled independently of th...
Publication date: 2016